cognitive profile
PrivacyCD: Hierarchical Unlearning for Protecting Student Privacy in Cognitive Diagnosis
Hou, Mingliang, Wang, Yinuo, Guo, Teng, Liu, Zitao, Dou, Wenzhou, Zheng, Jiaqi, Luo, Renqiang, Tian, Mi, Luo, Weiqi
The need to remove specific student data from cognitive diagnosis (CD) models has become a pressing requirement, driven by users' growing assertion of their "right to be forgotten". However, existing CD models are largely designed without privacy considerations and lack effective data unlearning mechanisms. Directly applying general-purpose unlearning algorithms is suboptimal, as they struggle to balance unlearning completeness, model utility, and efficiency when confronted with the unique heterogeneous structure of CD models. To address this, our paper presents the first systematic study of the data unlearning problem for CD models, proposing a novel and efficient algorithm: hierarchical importance-guided forgetting (HIF). Our key insight is that parameter importance in CD models exhibits distinct layer-wise characteristics. HIF leverages this via an innovative smoothing mechanism that combines individual and layer-level importance, enabling a more precise distinction of parameters associated with the data to be unlearned. Experiments on three real-world datasets show that HIF significantly outperforms baselines on key metrics, offering the first effective solution for CD models to respond to user data removal requests and for deploying high-performance, privacy-preserving AI systems.
- Asia > China > Jilin Province > Changchun (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Africa > Guinea > Kankan Region > Kankan Prefecture > Kankan (0.04)
- Information Technology > Security & Privacy (1.00)
- Education > Health & Safety > School Safety & Security > School Violence (0.40)
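The layer-wise importance idea in the abstract above can be sketched in a few lines. This is a minimal illustration under assumed details: the blending weight `alpha`, the toy importance scores, and the top-k selection rule are hypothetical stand-ins, not HIF's actual smoothing mechanism.

```python
# Hypothetical sketch: blend each parameter's own importance with its
# layer's mean importance, then flag the highest-scoring parameters as
# associated with the data to be unlearned.

def smoothed_importance(layer_importances, alpha=0.5):
    """Blend individual and layer-level importance for every parameter."""
    smoothed = {}
    for layer, scores in layer_importances.items():
        layer_mean = sum(scores) / len(scores)
        smoothed[layer] = [alpha * s + (1 - alpha) * layer_mean for s in scores]
    return smoothed

def select_for_unlearning(smoothed, top_k=2):
    """Pick the top-k parameters (by smoothed score) across all layers."""
    flat = [(score, layer, idx)
            for layer, scores in smoothed.items()
            for idx, score in enumerate(scores)]
    flat.sort(reverse=True)
    return [(layer, idx) for _, layer, idx in flat[:top_k]]

# Toy example: a "student" layer with one dominant parameter and a flat
# "exercise" layer; smoothing pulls the dominant score toward its layer
# mean but still ranks it first.
imp = {"student": [0.9, 0.1, 0.1], "exercise": [0.3, 0.3, 0.3]}
sm = smoothed_importance(imp, alpha=0.5)
print(select_for_unlearning(sm, top_k=1))  # [('student', 0)]
```

The point of the blend is that a raw per-parameter score ignores the distinct layer-wise importance profiles the abstract highlights; mixing in the layer mean regularizes the ranking across heterogeneous layers.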
Systematic Diagnosis of Brittle Reasoning in Large Language Models
A central question in artificial intelligence is the extent to which machine learning models comprehend mathematics. To address this, we propose a novel framework for measuring mathematical reasoning that moves beyond standard benchmarks to diagnose specific failure points. Our method first generates structured, step-by-step reasoning from gpt-3.5-turbo on the GSM8K dataset. We then use a more capable analyst model, gpt-4o-mini, to categorize errors and, crucially, perform an unsupervised clustering of every reasoning sentence to identify emergent "reasoning modes." This analysis reveals a cognitive profile with a stark, nonhuman-like brittleness: while the model achieves near-perfect accuracy on procedural modes like sequential calculation, its performance on modes requiring combinatorial reasoning with restrictions plummets. By identifying and quantifying the reliability of these distinct reasoning skills, our work provides a more granular method to evaluate mathematical comprehension and offers a precise roadmap for developing new capabilities and more reliable future applications.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > San Diego County > La Jolla (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.50)
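The per-mode reliability profile described in the abstract above can be sketched as a simple tally. The mode labels and data here are fabricated for illustration; the paper derives its modes via unsupervised clustering of reasoning sentences rather than taking them as given.

```python
# Hypothetical sketch: once each reasoning step carries a "reasoning
# mode" label, aggregate correctness per mode to expose brittle skills.
from collections import defaultdict

def accuracy_by_mode(records):
    """records: iterable of (mode, correct) pairs -> {mode: accuracy}."""
    hits, totals = defaultdict(int), defaultdict(int)
    for mode, correct in records:
        totals[mode] += 1
        hits[mode] += int(correct)
    return {m: hits[m] / totals[m] for m in totals}

# Toy data mirroring the abstract's finding: near-perfect sequential
# calculation, weak combinatorial reasoning with restrictions.
steps = ([("sequential", True)] * 9 + [("sequential", False)]
         + [("combinatorial", True)] * 2 + [("combinatorial", False)] * 8)
print(accuracy_by_mode(steps))  # {'sequential': 0.9, 'combinatorial': 0.2}
```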
11Plus-Bench: Demystifying Multimodal LLM Spatial Reasoning with Cognitive-Inspired Analysis
Li, Chengzu, Wu, Wenshan, Zhang, Huanyu, Li, Qingtao, Gao, Zeyu, Xia, Yan, Hernández-Orallo, José, Vulić, Ivan, Wei, Furu
In human cognition, spatial reasoning and perception are closely entangled, yet the nature of this interplay remains underexplored in the evaluation of multimodal large language models (MLLMs). While recent MLLM advancements show impressive performance on reasoning, their capacity for human-like spatial cognition remains an open question. In this work, we introduce a systematic evaluation framework to assess the spatial reasoning abilities of state-of-the-art MLLMs relative to human performance. Central to our work is 11Plus-Bench, a high-quality benchmark derived from realistic standardized spatial aptitude tests. 11Plus-Bench also features fine-grained expert annotations of both perceptual complexity and the reasoning process, enabling detailed instance-level analysis of model behavior. Through extensive experiments across 14 MLLMs and human evaluation, we find that current MLLMs exhibit early signs of spatial cognition. Despite a large performance gap compared to humans, MLLMs' cognitive profiles resemble those of humans in that cognitive effort correlates strongly with reasoning-related complexity. However, instance-level performance in MLLMs remains largely random, whereas human correctness is highly predictable and shaped by abstract pattern complexity. These findings highlight both the emerging capabilities and the limitations of current MLLMs' spatial reasoning and provide actionable insights for advancing model design.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > Austria > Vienna (0.14)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (3 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Education (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
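The instance-level contrast the abstract reports (predictable human correctness versus near-random model correctness) can be illustrated with a plain correlation between annotated complexity and correctness. The data below are fabricated toy values, and Pearson correlation on binary outcomes is only a stand-in for the paper's actual analysis.

```python
# Hypothetical sketch: correlate per-instance complexity with correctness
# for a "human" and a "model" response pattern.

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

complexity = [1, 2, 3, 4, 5, 6, 7, 8]   # expert complexity annotations
human = [1, 1, 1, 1, 1, 0, 0, 0]        # predictable: fails only hard items
model = [1, 0, 1, 0, 0, 1, 0, 1]        # near-random w.r.t. complexity

print("human:", round(pearson(complexity, human), 2))
print("model:", round(pearson(complexity, model), 2))
```

A strong negative correlation for the human pattern and a near-zero one for the model pattern is the signature the abstract describes.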
Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences
Haller, Patrick, Bolliger, Lena S., Jäger, Lena A.
To date, most investigations on surprisal and entropy effects in reading have been conducted on the group level, disregarding individual differences. In this work, we revisit the predictive power of surprisal and entropy measures estimated from a range of language models (LMs) on data of human reading times as a measure of processing effort by incorporating information of language users' cognitive capacities. To do so, we assess the predictive power of surprisal and entropy estimated from generative LMs on reading data obtained from individuals who also completed a wide range of psychometric tests. Specifically, we investigate if modulating surprisal and entropy relative to cognitive scores increases prediction accuracy of reading times, and we examine whether LMs exhibit systematic biases in the prediction of reading times for cognitively high- or low-performing groups, revealing what type of psycholinguistic subject a given LM emulates. Our study finds that in most cases, incorporating cognitive capacities increases predictive power of surprisal and entropy on reading times, and that generally, high performance in the psychometric tests is associated with lower sensitivity to predictability effects. Finally, our results suggest that the analyzed LMs emulate readers with lower verbal intelligence, suggesting that for a given target group (i.e., individuals with high verbal intelligence), these LMs provide less accurate predictability estimates.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Dominican Republic (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (0.55)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
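The modulation the abstract tests (does scaling surprisal by a reader's cognitive score improve reading-time prediction?) can be sketched as a comparison of two linear fits. The simulated data, the linear form, and the shrinking-slope assumption below are all illustrative, not the paper's actual regression models.

```python
# Hypothetical sketch: compare reading-time fits with and without a
# surprisal-by-cognitive-score interaction term, via ordinary least squares.
import numpy as np

rng = np.random.default_rng(0)
n = 200
surprisal = rng.uniform(1, 10, n)   # per-word surprisal (bits)
score = rng.uniform(0, 1, n)        # psychometric score, scaled to 0..1
# Simulate the abstract's finding: high scorers are LESS sensitive to
# predictability, so the surprisal slope shrinks with the score.
rt = 200 + 30 * surprisal * (1 - 0.8 * score) + rng.normal(0, 5, n)

def r2(X, y):
    """Coefficient of determination for a least-squares fit of X to y."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

base = np.column_stack([np.ones(n), surprisal])
inter = np.column_stack([np.ones(n), surprisal, surprisal * score])
print(f"R2 without interaction: {r2(base, rt):.3f}")
print(f"R2 with interaction:    {r2(inter, rt):.3f}")
```

Under these simulated readers, the interaction model recovers most of the variance the surprisal-only model misses, which is the kind of gain the abstract attributes to incorporating cognitive capacities.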
Inferring Capabilities from Task Performance with Bayesian Triangulation
Burden, John, Voudouris, Konstantinos, Burnell, Ryan, Rutar, Danaja, Cheke, Lucy, Hernández-Orallo, José
As machine learning models become more general, we need to characterise them in richer, more meaningful ways. We describe a method to infer the cognitive profile of a system from diverse experimental data. To do so, we introduce measurement layouts that model how task-instance features interact with system capabilities to affect performance. These features must be triangulated in complex ways to be able to infer capabilities from non-populational data -- a challenge for traditional psychometric and inferential tools. Using the Bayesian probabilistic programming library PyMC, we infer different cognitive profiles for agents in two scenarios: 68 actual contestants in the AnimalAI Olympics and 30 synthetic agents for O-PIAAGETS, an object permanence battery. We showcase the potential for capability-oriented evaluation.
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine (0.93)
- Education (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- (2 more...)
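The triangulation idea in the abstract above can be illustrated with a toy grid posterior: a latent capability and per-task difficulty jointly determine success through a logistic link, so observing successes across tasks of varying difficulty localizes the capability. This stands in for the paper's PyMC measurement layouts; the logistic item-response form, flat prior, and grid are all assumptions made for illustration.

```python
# Hypothetical sketch: grid-approximate posterior over a scalar capability
# given (difficulty, success) observations, under a logistic success model.
import math

def posterior_capability(results, grid=None):
    """results: list of (difficulty, success) -> {capability: prob}."""
    if grid is None:
        grid = [i / 10 for i in range(0, 21)]    # capabilities 0.0 .. 2.0
    post = {}
    for c in grid:
        like = 1.0
        for d, success in results:
            p = 1 / (1 + math.exp(-(c - d)))     # logistic item response
            like *= p if success else (1 - p)
        post[c] = like                           # flat prior over the grid
    z = sum(post.values())
    return {c: v / z for c, v in post.items()}

# An agent that passes easy tasks and fails hard ones: the posterior mode
# lands between the hardest success and the easiest failure.
obs = [(0.2, True), (0.4, True), (0.6, True), (1.2, False), (1.5, False)]
post = posterior_capability(obs)
mode = max(post, key=post.get)
print(f"posterior mode capability: {mode:.1f}")
```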
Brain wiring could be behind learning difficulties, say experts
Learning difficulties are not linked to differences in particular brain regions but to differences in how the brain is wired, research suggests. According to figures from the Department for Education, 14.9% of all pupils in England – about 1.3 million children – had special educational needs in January 2019, with 271,200 having difficulties that required support beyond typical special needs provision. Dyslexia, attention deficit hyperactivity disorder (ADHD), autism and dyspraxia are among conditions linked to learning difficulties. Now experts say different learning difficulties are not specific to particular diagnoses, nor are they linked to particular regions of the brain – as has previously been thought. Instead the team, from the University of Cambridge, say learning difficulties appear to be associated with differences in the way connections in the brain are organised.
AI Can Better Predict Why Children Struggle at School
Our understanding of learning difficulties largely comes from children with specific diagnoses or individuals selected from community/clinical samples according to strict inclusion criteria. Applying strict exclusionary criteria overemphasizes within-group homogeneity and between-group differences, and fails to capture comorbidity. Here, we identify cognitive profiles in a large heterogeneous sample of struggling learners, using unsupervised machine learning in the form of an artificial neural network. Children were referred to the Centre for Attention Learning and Memory (CALM) by health and education professionals, irrespective of diagnosis or comorbidity, for problems in attention, memory, language, or poor school progress (n = 530). Children completed a battery of cognitive and learning assessments, underwent a structural MRI scan, and their parents completed behavior questionnaires.
Try these simple mental tests to see if you're a good athlete
Simple mental tests may be able to identify people who are likely to reach the top of their sport. That's according to researchers who showed that elite athletes who play team sports aren't just stronger and faster than the rest of us – some of their cognitive skills are better, too. Young soccer players, competing at the top level in Sweden, performed better than the general population on tests of so-called "executive function". And the better their results, the more goals they scored. Executive function isn't a measure of intelligence – it describes unconscious mental abilities like our working memory, which is involved in manipulating transient information to help us make decisions, and attentional control, which is our ability to choose what to pay attention to and what to ignore.